Dirichlet negative multinomial regression for overdispersed correlated count data
نویسندگان
چکیده
A generic random effects formulation for the Dirichlet negative multinomial distribution is developed together with a convenient regression parameterization. A simulation study indicates that, even when somewhat misspecified, regression models based on the Dirichlet negative multinomial distribution have smaller median absolute error than generalized estimating equations, with a particularly pronounced improvement when correlation between observations in a cluster is high. Estimation of explanatory variable effects and sources of variation is illustrated for a study of clinical trial recruitment.
منابع مشابه
Latent feature regression for multivariate count data
We consider the problem of regression on multivariate count data and present a Gibbs sampler for a latent feature regression model suitable for both underand overdispersed response variables. The model learns countvalued latent features conditional on arbitrary covariates, modeling them as negative binomial variables, and maps them into the dependent count-valued observations using a Dirichlet-...
متن کاملRobust Estimation and Outlier Detection for Overdispersed Multinomial Models of Count Data
Robust Estimation and Outlier Detection for Overdispersed Multinomial Models of Count Data We develop a robust estimator—the hyperbolic tangent (tanh) estimator—for overdispersed multinomial regression models of count data. The tanh estimator provides accurate estimates and reliable inferences even when the specified model is not good for as much as half of the data. Seriously ill-fitted counts...
متن کاملEstimation of Count Data using Bivariate Negative Binomial Regression Models
Abstract Negative binomial regression model (NBR) is a popular approach for modeling overdispersed count data with covariates. Several parameterizations have been performed for NBR, and the two well-known models, negative binomial-1 regression model (NBR-1) and negative binomial-2 regression model (NBR-2), have been applied. Another parameterization of NBR is negative binomial-P regression mode...
متن کاملScalable Bayesian Non-negative Tensor Factorization for Massive Count Data
We present a Bayesian non-negative tensor factorization model for count-valued tensor data, and develop scalable inference algorithms (both batch and online) for dealing with massive tensors. Our generative model can handle overdispersed counts as well as infer the rank of the decomposition. Moreover, leveraging a reparameterization of the Poisson distribution as a multinomial facilitates conju...
متن کاملEstimating error models for whole genome sequencing using mixtures of Dirichlet-multinomial distributions
Motivation Accurate identification of genotypes is an essential part of the analysis of genomic data, including in identification of sequence polymorphisms, linking mutations with disease and determining mutation rates. Biological and technical processes that adversely affect genotyping include copy-number-variation, paralogous sequences, library preparation, sequencing error and reference-mapp...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 14 شماره
صفحات -
تاریخ انتشار 2013